Maltilex: A Computational Lexicon for Maltese
Abstract
The project described in this paper, which is still in the preliminary phase, concerns the design and implementation of a computational lexicon for Maltese, a language very much in current use but so far lacking most of the infrastructure required for NLP. One of the main characteristics of Maltese, a source of many difficulties, is that it is an amalgam of different language types (chiefly Semitic and Romance), as illustrated in the first part of the paper. The latter part of the paper describes our general approach to the problem of constructing the lexicon.
Similar resources
The development of language resources for Maltese
This paper describes two aspects of the work going on to computerise resources for the Maltese language. The first part describes work on labelling and annotation of spoken Maltese to generate a database suitable for use in deriving speech and speaker recognition tools. It also describes an interactive development system SSUNN that is being used for this work. The second part describes approach...
The Future of Maltilex
The Maltilex project, supported by the University of Malta, has now been running for approximately 3 years. Its aim is to create a computational lexicon of Maltese to serve as the basic infrastructure for the development of a wide variety of language-enabled applications. The project is further described in Rosner et al. (1998, 1999). This paper discusses the backgr...
Creation and Evaluation of Extensible Language Resources for Maltese
The creation of Language Resources is a labour-intensive process whose difficulty is further compounded when minority languages are concerned (Cunningham, 1999). This paper discusses the creation of an extensible set of Language Resources for Maltese developed by the Maltilex Project at the University of Malta (Rosner et al., 1999), together with quality evaluation mechanisms for minority
Verb Morphology of Hebrew and Maltese: Towards an Open Source Type Theoretical Resource Grammar in GF
One of the first issues that a programmer must tackle when writing a complete computer program that processes natural language is how to design the morphological component. A typical morphological component should cover three main aspects in a given language: (1) the lexicon, i.e. how morphemes are encoded, (2) orthographic changes, and (3) morphotactic variations. This is in particular challen...
Adaptation of the F-measure to Cluster Based Lexicon Quality Evaluation
An external lexicon quality measure called the L-measure is derived from the F-measure (Rijsbergen, 1979; Larsen and Aone, 1999). The typically small sample sizes available for minority languages and the evaluation of Semitic language lexicons are two main factors considered. Large-scale evaluation results for the Maltilex Corpus are presented (Rosner et